Methods for Building Sense Inventories of Abbreviations in Clinical Notes

Authors

  • Hua Xu
  • Peter D. Stetson
  • Carol Friedman
Abstract

OBJECTIVE To develop methods for building corpus-specific sense inventories of abbreviations occurring in clinical documents. DESIGN A corpus of internal medicine admission notes was collected, and the instances of each clinical abbreviation in the corpus were grouped into sense clusters. One instance from each cluster was manually annotated to generate a final list of senses. Two clustering-based methods, Expectation Maximization (EM) and Farthest First (FF), and one random sampling method for sense detection were evaluated using a set of 12 clinical abbreviations. MEASUREMENTS The clustering-based sense detection methods were evaluated using a set of clinical abbreviations that were manually sense annotated. "Sense Completeness" and "Annotation Cost" were used to measure the performance of the different methods. Clustering error rates were also reported for the different clustering algorithms. RESULTS A clustering-based semi-automated method was developed to build corpus-specific sense inventories for abbreviations in hospital admission notes. Evaluation demonstrated that this method could substantially reduce manual annotation cost and increase the completeness of sense inventories when compared with a manual annotation method using random samples. CONCLUSION The authors developed an effective clustering-based method for building corpus-specific sense inventories for abbreviations in a clinical corpus. To the best of the authors' knowledge, this is the first time clustering technologies have been used to help build sense inventories of abbreviations in clinical text. The results demonstrated that the clustering-based method performed better than the manual annotation method using random samples for the task of building sense inventories of clinical abbreviations.
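The cluster-then-annotate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the feature set, the distance measure, or the exact metric definitions, so the bag-of-words context vectors, the cosine distance, the toy notes for the abbreviation "pt", and the way completeness and cost are computed here are all assumptions made for illustration.

```python
from collections import Counter
import math

def context_vector(text):
    """Bag-of-words counts for the words around an abbreviation (assumed features)."""
    return Counter(text.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def farthest_first_centers(vectors, k):
    """Farthest-first traversal: start at index 0, then repeatedly add the
    instance whose distance to its nearest chosen center is largest."""
    centers = [0]
    nearest = [cosine_distance(v, vectors[0]) for v in vectors]
    while len(centers) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=lambda i: nearest[i])
        centers.append(nxt)
        for i, v in enumerate(vectors):
            nearest[i] = min(nearest[i], cosine_distance(v, vectors[nxt]))
    return centers

# Hypothetical toy instances of "pt" with gold senses (invented for this sketch).
instances = [
    ("pt admitted with chest pain", "patient"),
    ("pt admitted with fever", "patient"),
    ("pt denies chest pain", "patient"),
    ("consult pt for gait training", "physical therapy"),
    ("pt evaluation for gait training", "physical therapy"),
]

vectors = [context_vector(text) for text, _ in instances]
centers = farthest_first_centers(vectors, k=2)

# Only the cluster representatives are "manually annotated".
detected = {instances[i][1] for i in centers}
all_senses = {sense for _, sense in instances}

cost = len(centers)                               # assumed: number of annotated instances
completeness = len(detected) / len(all_senses)    # assumed: fraction of senses found
print(centers, detected, cost, completeness)
```

In this toy setup, annotating two representatives is enough to reach both senses, whereas two instances drawn at random could easily both be the majority sense. That is the intuition behind comparing clustering against random sampling, though real clinical notes are far noisier than this example.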

Similar resources

A Study of Abbreviations in Clinical Notes

Various natural language processing (NLP) systems have been developed to unlock patient information from narrative clinical notes in order to support knowledge-based applications such as error detection, surveillance and decision support. In many clinical notes, abbreviations are widely used without mention of their definitions, which is very different from the use of abbreviations in the biome...

A sense inventory for clinical abbreviations and acronyms created using clinical notes and medical dictionary resources

OBJECTIVE To create a sense inventory of abbreviations and acronyms from clinical texts. METHODS The most frequently occurring abbreviations and acronyms from 352,267 dictated clinical notes were used to create a clinical sense inventory. Senses of each abbreviation and acronym were manually annotated from 500 random instances and lexically matched with long forms within the Unified Medical L...

Challenges and Practical Approaches with Word Sense Disambiguation of Acronyms and Abbreviations in the Clinical Domain

OBJECTIVES Although acronyms and abbreviations in clinical text are used widely on a daily basis, relatively little research has focused upon word sense disambiguation (WSD) of acronyms and abbreviations in the healthcare domain. Since clinical notes have distinctive characteristics, it is unclear whether techniques effective for acronym and abbreviation WSD from biomedical literature are suffi...

Translation of Acronyms, Initialisms and Abbreviations (AIA) in Persian Political and Sport Journalistic Texts

The different writing systems of English and Persian make translation of acronyms, initialisms and abbreviations challenging. This study aimed at finding which strategies were applied most frequently in translating acronyms, initialisms and abbreviations from English to Persian especially in journalistic texts. The study was done based on Descriptive Translation Study of Toury and strategies pr...

Abbreviation and Acronym Disambiguation in Clinical Discourse

Use of abbreviations and acronyms is pervasive in clinical reports despite many efforts to limit the use of ambiguous and unsanctioned abbreviations and acronyms. Due to the fact that many abbreviations and acronyms are ambiguous with respect to their sense, complete and accurate text analysis is impossible without identification of the sense that was intended for a given abbreviation or acrony...

Journal title:
  • AMIA ... Annual Symposium proceedings. AMIA Symposium

Publication date: 2008